(57) ABSTRACT 

A method is provided for load balancing requests for a 
replicated service or application among a plurality of servers 
operating instances of the replicated service or application. 
A policy is selected for choosing a preferred server from the 
plurality of servers according to one or more specified status 
or operational characteristics of the servers, such as the 
least-loaded or closest server. The policy is encapsulated 
within multiple levels of objects or modules that are dis- 
tributed among the servers offering the replicated service 
and a central server that receives requests for the service. 
Status objects gather or retrieve information concerning the 
specified status or operational characteristic(s) of each of the
plurality of servers. An individual server monitor object 
operates for each instance of the replicated service to invoke 
one or more status objects and receive the necessary infor- 
mation. A central replicated monitor object receives the 
information from each individual server monitor object. The 
information from the servers is analyzed to select the server 
having the optimal status or operational characteristic(s). An
update object updates the central server, such as a domain 
name server, to indicate the preferred server. Requests for 
the replicated service are then directed to the preferred 
server until a different preferred server is identified. 

30 Claims, 5 Drawing Sheets 
LOAD BALANCING FOR REPLICATED
SERVICES 

BACKGROUND 

This invention relates to the field of computer systems.
More particularly, a system and methods are provided for 
load balancing among replicated services using policies. 

In many computing environments, clients such as com- 
puter systems and users connect to computer servers offering 
a desired service, such as electronic mail or Internet
browsing. One computer server may, however, only be capable of
efficiently satisfying the needs of a limited number of
clients. In such a case, an organization may employ multiple
servers offering the same service, in which case the client
may be connected to any of the multiple servers in order to
satisfy the client's request. 

A service offered simultaneously on multiple servers is 
often termed "replicated" in recognition of the fact that each 
instance of the service operates in substantially the same
manner and provides substantially the same functionality as 
the others. The multiple servers may, however, be situated in 
various locations and serve different clients. In order to 
make effective use of a replicated service offered by multiple 
servers (e.g., to satisfy clients' requests for the service), 
there must be a method of distributing clients' requests 
among the servers. This process is often known as load 
balancing. 

In one method of load balancing, clients' requests are
assigned to the servers offering the replicated service on a
round-robin basis. In other words, client requests are routed 
to the servers in a rotational order. Each instance of the 
replicated service may thus receive substantially the same 
number of requests as the other instances. Unfortunately, 
this scheme can be very inefficient.

Because the servers that offer the replicated service can be 
geographically distributed, a client's request may be routed 
to a relatively distant server, thus increasing the transmission 
time and cost incurred in submitting the request and receiving
a response. In addition, the processing power of the
servers may vary widely. One server may, for example, be 
capable of handling a larger number of requests or be able 
to process requests faster than another server. As a result, the 
more powerful server may periodically be idle while the 
slower server is overburdened.

In another method of load balancing, specialized hard- 
ware is employed to store information concerning the serv- 
ers offering the replicated service. In particular, this method 
stores information, on a computer system other than the 
system that initially receives client requests, about which of
the servers has the smallest load (e.g., fewest client 
requests). Based on that information a user's request is 
routed to the least-loaded server. In a web-browsing 
environment, for example, when a user's service access 
request (e.g., a connection request to a particular Uniform
Resource Locator (URL) or virtual server name) is received 
by a server offering Domain Name Services (DNS), the DNS 
server queries or passes the request to the specialized 
hardware. Based on the stored information, the user's 
request is then forwarded to the least-loaded server offering
the requested service. 

This method is also inefficient because it delays and adds 
a level of complexity to satisfying access requests. In 
particular, one purpose of a DNS server is to quickly resolve 
a client's request for a particular service to a specific server
(e.g., a specific network address) offering the service. 
Requiring the DNS server to query or access another server 
in order to resolve the request is inefficient and delays the
satisfaction of the request. 

In yet other methods of balancing requests among mul- 
tiple instances of a replicated service, client requests are 
randomly assigned to a server or are assigned to the closest 
server. Random assignment of client requests often results in 
requests being routed to geographically distant servers or 
servers that are more burdened than others, thus resulting in 
unnecessary delay. Assigning requests to the closest server 
is also inefficient because a faster response may be available 
from a server that, although further from the client, has less 
of a load. 

In addition to the above disadvantages of present load 
balancing techniques, present techniques are limited in 
scope. For example, in the methods described above, load- 
balancing decisions are made solely on the basis of opera- 
tional statistics concerning the servers offering a replicated 
service, not the status of the service itself. In other words, 
present techniques do not provide for the collection or 
consideration of information concerning the status of indi- 
vidual applications or services executing on the servers. 
Thus, a client's request for a particular application or service 
may be routed to a first server that has less of an overall load 
than a second server, even though the specific application 
request could be more efficiently and/or rapidly handled by 
the second server. 

SUMMARY 

In one embodiment of the invention a system and methods 
are provided for balancing client (e.g., user) requests among 
multiple instances of a replicated service or application in 
accordance with a selected policy. In this embodiment, 
instances of the replicated service execute on separate 
computer servers. 

A load balancing policy is selected to specify one or more 
factors to be used in determining the server (e.g., one of 
multiple servers offering a replicated service) that is to 
receive a client request. The identity of the "preferred" 
server is periodically updated in order to distribute requests 
for the service or application among the multiple servers. 
Illustrative policies include selecting the least-loaded or 
closest server. Illustratively, the least-loaded server is the 
server having the shortest response time or fewest pending 
client requests and the closest server is the server that can be 
reached in the fewest network hops or connections. 
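
By way of illustration only, the two policies named above may be sketched in Python as follows. The class, function and server names are hypothetical, and the tie-breaking rule (response time first, then pending requests) is an assumption; the description states only that either measure may indicate the least-loaded server.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    name: str
    response_ms: float  # measured response time
    pending: int        # pending client requests
    hops: int           # network hops from the central server


def least_loaded(servers):
    # Assumed ordering: shortest response time, then fewest pending requests.
    return min(servers, key=lambda s: (s.response_ms, s.pending))


def closest(servers):
    # Fewest network hops or connections.
    return min(servers, key=lambda s: s.hops)


POLICIES = {"least-loaded": least_loaded, "closest": closest}


def preferred_server(policy, servers):
    """Return the name of the server chosen by the named policy."""
    return POLICIES[policy](servers).name


servers = [
    ServerStatus("server110", 42.0, 7, 3),
    ServerStatus("server112", 18.5, 2, 5),
    ServerStatus("server114", 30.1, 4, 1),
]
print(preferred_server("least-loaded", servers))
print(preferred_server("closest", servers))
```

Keeping each policy as a separate function means that a change of policy replaces only the selection step and leaves the collection of status information untouched.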

Depending upon the selected policy, status objects or 
modules are created to collect information from each server 
offering the replicated service or application that is being 
load-balanced. The information collected from each server 
may include the number of requests held and/or processed 
by the server or service, the response time and/or operational 
status (e.g., is it up or down) of the server or service, the 
distance (e.g., the number of network hops) to the server, etc. 

Each instance of a replicated service or application is 
associated with its own status object(s). In one embodiment 
of the invention multiple status objects having different 
functions are associated with one instance. Each instance of 
the replicated service is also associated with an individual 
monitor object (IMO) or module. Each IMO thus collects
and saves information from the status object(s) of one 
service instance. Illustratively, the IMO queries its status 
object(s) on a periodic basis and stores the information that 
is returned. 
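
The relationship between an IMO and its status object(s) may be sketched as follows, purely as an illustration. The class names, the `poll` method and the use of a caller-supplied timestamp are assumptions made for this sketch and are not taken from the described embodiments.

```python
class StatusObject:
    """Wraps one measurement (e.g., response time) for one service instance."""

    def __init__(self, name, probe):
        self.name = name
        self.probe = probe  # zero-argument callable returning a value

    def collect(self):
        return self.probe()


class IndividualMonitor:
    """Queries its status objects periodically and stores what they return."""

    def __init__(self, server, status_objects, period_s):
        self.server = server
        self.status_objects = list(status_objects)
        self.period_s = period_s
        self.latest = {}       # most recent value from each status object
        self.last_poll = None  # time of the most recent poll, if any

    def poll(self, now):
        """Poll if the period has elapsed; return True when a poll happened."""
        if self.last_poll is not None and now - self.last_poll < self.period_s:
            return False
        self.latest = {so.name: so.collect() for so in self.status_objects}
        self.last_poll = now
        return True


imo = IndividualMonitor(
    "server110",
    [StatusObject("response_ms", lambda: 18.5), StatusObject("up", lambda: True)],
    period_s=60,
)
imo.poll(now=0)
print(imo.latest)
```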

A replicated monitor object (RMO) or module is 
employed to collect information from the IMOs associated 
with the various instances of the replicated service. The 
RMO stores this information, which is then processed to
identify a preferred server (e.g., least-loaded or closest).

In an embodiment of the invention in which clients access
the replicated service through a system such as a Domain
Name Service (DNS) server, a DNS updater object or
module updates a DNS zone file to identify the preferred
server (e.g., by its network address). A DNS zone file may
be used to resolve a virtual server name (e.g., a virtual
identity of a service replicated on multiple servers) to a
particular server. When a client requests a replicated service
accessed via a virtual name, the DNS server directs the
request to the server indicated in the zone file.

In one embodiment of the invention the status objects,
IMOs, the RMO and the DNS updater are co-located (e.g.,
on a DNS server). Illustratively, the servers and replicated
services need not be modified in this non-intrusive mode of
operation. The status objects use network functions or
commands (e.g., Ping, Connect) to retrieve operational and
load information concerning a server (e.g., the response time
of a server, whether a server or service is up or down). In an
alternative embodiment of the invention an intrusive mode
of operation is enabled in which the status object(s) and
IMOs execute on individual servers that operate instances of
a replicated service or application. In this alternative
embodiment the RMO and DNS updater may remain on the
DNS server.

In another alternative embodiment of the invention a
specialized object or module other than a DNS updater is
generated to act upon the selection of a preferred server. In
this alternative embodiment, the specialized object is
configured to update data structures or otherwise cause the
direction or re-direction of load-balanced requests to the
preferred server.

DESCRIPTION OF THE FIGURES

FIG. 1 is a block diagram depicting an illustrative
environment in which an embodiment of the present invention
may be implemented to load balance client requests among
multiple servers.

FIG. 2 is a block diagram depicting a method of balancing
client requests among multiple servers in a non-intrusive
manner in accordance with an embodiment of the present
invention.

FIG. 3 is a block diagram depicting a method of balancing
client requests among multiple servers in an intrusive
manner in accordance with an embodiment of the present
invention.

FIG. 4 is a block diagram depicting a method of balancing
client requests among geographically dispersed servers in
accordance with an embodiment of the present invention.

FIG. 5 is a flow chart demonstrating one method of
establishing a system for load balancing client requests for
a replicated server or application in accordance with an
embodiment of the present invention.

DETAILED DESCRIPTION

The following description is presented to enable any
person skilled in the art to make and use the invention, and
is provided in the context of particular applications of the
invention and their requirements. Various modifications to
the disclosed embodiments will be readily apparent to those
skilled in the art and the general principles defined herein
may be applied to other embodiments and applications
without departing from the spirit and scope of the present
invention. Thus, the present invention is not intended to be
limited to the embodiments shown, but is to be accorded the
widest scope consistent with the principles and features
disclosed herein.

In particular, illustrative embodiments of the invention are
described in the context of browsing the worldwide web or
other Internet content and services. These embodiments of
the invention therefore involve the use of Domain Name
Services (DNS) to resolve access requests to virtual server
names into addresses of physical machines such as computer
servers. One skilled in the art will appreciate that a DNS
server may therefore be used to balance or distribute
requests among multiple web or Internet servers. One skilled
in the art will also recognize that the present invention is not
limited to such an environment but may be readily adapted
to other environments in which load balancing is required
for a replicated service or application program.

The program environment in which a present embodiment
of the invention is executed illustratively incorporates a
general-purpose computer or a special purpose device such
as a hand-held computer. Details of such devices (e.g.,
processor, memory, data storage and display) are well
known and are omitted for the sake of clarity.

It should also be understood that the techniques of the
present invention might be implemented using a variety of
technologies. For example, the methods described herein
may be implemented in software running on a computer
system, or implemented in hardware utilizing either a
combination of microprocessors or other specially designed
application specific integrated circuits, programmable logic
devices, or various combinations thereof. In particular, the
methods described herein may be implemented by a series of
computer-executable instructions residing on a storage
medium such as a carrier wave, disk drive, or
computer-readable medium. In addition, although specific
embodiments of the invention are described using object-oriented
software programming concepts, the invention is not so
limited and is easily adapted to employ other forms of
directing the operation of a computer.

In a present embodiment of the invention, information
concerning the operation of computer servers executing a
replicated service is collected and processed to identify a
preferred server (e.g., the server with the smallest load or
shortest response time). Illustrative pieces of information
that are collected include a server's response time, its
distance from a central server (such as a name server
providing DNS services), its operational status (e.g.,
whether it is up or down), etc.

For purposes of the present invention a replicated service
is a service (e.g., web browsing, electronic mail) that is
available on multiple servers. For example, an organization
providing a service or application that is visited or invoked
by numerous clients may employ several web servers to
handle the requests. Each of the several servers is considered
to operate a separate instance of the replicated service or
application. Individual users may thus be routed to, and their
requests satisfied by, any of the several servers.

The collected information is then analyzed and a preferred
server is identified in accordance with a selected policy. In
accordance with one illustrative policy, the preferred server
is the server that is least-loaded. Another policy identifies the
preferred server as being the closest server. After the
preferred server is identified, subsequent requests for the
replicated service or application are directed to that server. For
example, in a web-browsing environment a DNS lookup
table, or zone file, is updated to indicate that requests for the
replicated service are to be routed to the preferred server.

The information described above is collected, and a new
preferred server identified, on a regular or periodic basis. By
periodically changing the preferred server, client requests
are load-balanced between the participating servers.

In an alternative embodiment of the invention, load
balancing is still performed among applications or replicated
services receiving multiple client requests, but the
information used to identify a preferred server or preferred instance
of the application relates to the application rather than the
server. In this alternative embodiment, for example, a
database application may be modified to track statistics such as
the number of users being serviced by each instance of the
application or the number of access requests that are pending
with each instance. Requests may then be balanced among
the instances by comparing the load handled by each one
and, for example, selecting the server having the
least-loaded instance (e.g., the instance having the fewest
incomplete requests) to receive new requests.

FIG. 1 is a block diagram depicting an illustrative
environment in which an embodiment of the invention may be
implemented to balance web browsing access requests to an
Internet service among multiple web servers. Illustratively,
nameserver 100 is a computer offering Domain Name Services
(DNS) with DNS 102. Back-end servers 110, 112 and 114
are web servers offering a replicated Internet service.

Nameserver 100 includes zone file 104, which is used to
resolve requests for the replicated service to an address of a
server offering the requested service. Zone file 104 thus
includes an entry for a virtual server name (e.g.,
www.sun.com) to allow [...] server 112 [...]. [...]
www.sun.com may indicate a network (e.g., Internet protocol)
address [...] 114. Servers 110, 112 and 114 may [...]
one another (e.g., geographically or logically).

Client 120 is illustratively a personal computer or
workstation configured to provide a user access to a network (e.g.,
the Internet) and various applications and services on servers
110, 112 and 114. Client 120 is thus coupled to nameserver
100 via network 122 and includes instructions (e.g., a web
browser) for communicating via network 122. Client 120
further includes common components such as a processor,
memory, storage, input and output devices, etc. Such
common components are well known to those skilled in the art
and are omitted from FIG. 1 for the purpose of clarity.

In the environment [...] web browser to access a [...]
clients via a virtual server name, the access request is
received by nameserver 100. Nameserver 100, through DNS
102, identifies a server to handle the request. In particular,
DNS 102 accesses zone file 104 and returns [...]
address of a server offering the replicated service [...].

In one embodiment of the present invention, the specific
server identified in the zone file is determined according to
a selected policy, as discussed below. Further, the server
identified in zone file 104 is updated from time to time in
accordance with the selected policy in order to distribute
client requests among the servers offering the replicated
service.

In an alternative embodiment of the invention, instead of
returning an address of a server, the DNS lookup in zone file
104 returns an identifier (e.g., file name) of a set of
executable instructions. The executable instructions are executed,
illustratively by nameserver 100, in order to perform a
variety of actions (e.g., load or mount an alternate Internet
or domain namespace).

In a present embodiment of the invention information
reflecting the status or operation of servers 110, 112 and 114
is collected and analyzed in accordance with the selected
policy to identify a "preferred" server to be exposed to
clients via zone file 104. The various pieces of information
that may be collected illustratively include: whether a server
or instance of a replicated service is operational; the
response time for a request submitted to a server or service
instance; the number of requests processed by or pending on
a server or service instance; a server's proximity (e.g., the
number of network hops necessary to reach the server from
nameserver 100), etc. In one embodiment of the invention,
a series of computer-readable instructions are executed to
collect, assemble and analyze the various pieces of
information and to update DNS zone file 104.

Advantageously, the computer-readable instructions take
the form of executable objects or modules. The objects or
modules are illustratively created in a suitable programming
language or script and then configured and installed on
nameserver 100. In alternative embodiments of the
invention, the executable objects or modules are distributed
among nameserver 100 and servers 110, 112 and 114.

FIG. 2 depicts an illustrative non-intrusive embodiment of
the invention in which operational and statistical
information is collected from servers 110, 112 and 114 and analyzed
on nameserver 100 using executable objects installed on
nameserver 100.

In this mode of operation, status objects 200a, 200b and
200c are invoked on nameserver 100 for the purpose of
collecting information from servers 110, 112 and 114,
respectively. The configuration and purpose of the status
objects depend upon the policy that has been selected for
choosing a preferred server. For example, where the selected
policy requires choosing the least-loaded server (e.g., that
which has the fastest response time), each status object
measures the response time of its associated server.
Illustratively, this may be accomplished by issuing a Ping (or
similar) command to the server and measuring the response
time. As another example, where the selected policy requires
choosing the closest server the status object is illustratively
configured to measure the number of hops from nameserver
100 to the object's associated server.

In yet another embodiment of the invention, status objects
200a, 200b and 200c are configured to determine whether a
particular service (e.g., web service, electronic mail service),
application program or server is operational. Illustratively,
the status objects in this embodiment issue a Connect (or
similar) command to the target service or server. If a
Connect command is successful the issuing object knows
that the target is operational, otherwise it is assumed to be
inoperative.

Illustratively, for each replicated service (or application)
that is to be monitored (i.e., that is subject to load balancing)
on a server, a separate status object operates on nameserver
100. In addition, each status object illustratively performs a
single function (e.g., determine response time, determine a
server's distance from nameserver 100). In alternative
embodiments of the invention, however, a single status
object may monitor multiple servers or services and/or
perform multiple functions.

In FIG. 2, individual monitor objects (IMO) 202a, 202b
and 202c also reside and execute on nameserver 100. A
separate IMO is depicted for each instance of the replicated
service. In particular, IMOs 202a, 202b and 202c invoke and
collect information from status objects 200a, 200b and 200c,
respectively. Individual monitor objects may also be known 
as server monitor objects. Although FIG. 2 depicts only one 
status object associated with each IMO, depending upon the 
selected load-balancing policy (e.g., the criteria for choosing 
a preferred server), multiple status objects may be associated 
with an IMO. In such an environment, the IMO will invoke 
and/or collect information from each associated status 
object. 

In the presently described embodiment, different types of 
status objects are invoked with differing degrees of regularity.
When the active status objects collect the servers'
response times, for example, IMO 202a may collect infor- 
mation from status object 200a relatively frequently (e.g., 
every 60 seconds) to determine the response time of server 
110. In contrast, when the active status objects reflect a 
policy preferring the closest server, IMO 202b may invoke
status object 200b only occasionally (e.g., once per day)
because the distance from nameserver 100 to server 112 is 
unlikely to change very often. 

Although each IMO is associated with only one status 
object in the illustrated embodiment, in an alternative 
embodiment of the invention an IMO may invoke and 
collect data from multiple status objects. In this alternative 
embodiment, for example, an IMO may invoke one status 
object to determine the response time of a server or service 
and another status object to determine whether the server is 
operational (i.e., whether the server is up). Illustratively, the 
Ping command is used to determine whether a server is 
operational. If the server does not respond to the Ping 
command, it may be assumed to be down. 
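
A status object of this kind may be sketched as follows, for illustration only. Because issuing a Ping typically requires raw sockets or an external command, the sketch substitutes a TCP connection attempt, in the spirit of the Connect command mentioned elsewhere in this description; the function name `probe` and its `connector` and `clock` parameters are hypothetical and exist chiefly so that the sketch can be exercised without a network.

```python
import socket
import time


def probe(host, port, timeout=2.0,
          connector=socket.create_connection, clock=time.monotonic):
    """Return (is_up, response_seconds) for one connection attempt.

    A refused or timed-out connection marks the target as down, in which
    case no response time is reported.
    """
    start = clock()
    try:
        conn = connector((host, port), timeout)
    except OSError:
        # Covers refusals, timeouts and unreachable hosts.
        return False, None
    conn.close()
    return True, clock() - start
```

A caller might invoke `probe("192.0.2.10", 80)` once per polling interval and hand the result to the associated IMO; the address shown is a documentation placeholder.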

Replicated monitor object (RMO) 204 retrieves the infor- 
mation collected by status objects from each IMO associated 
with one replicated service or application. Therefore, in the 
illustrated embodiment where each of servers 110, 112 and
114 operates a separate instance of a replicated service (e.g.,
web browsing), RMO 204 collects data from IMOs 202a,
202b and 202c. If the servers also offered another replicated
service (e.g., electronic mail) or application, a second RMO 
would illustratively operate on nameserver 100 for the 
purpose of retrieving information concerning that service 
from a different set of IMOs. A replicated monitor object 
may also be known as a central monitor object due to its 
coordination role on behalf of a central server (e.g., 
nameserver 100) receiving multiple requests for a replicated 
service or application. 

Thus, for each replicated service or application for which 
load balancing is performed in accordance with present 
embodiments of the invention, a status object collects load 
and/or operational information from each server executing 
an instance of the service or application. In addition, an IMO 
exists for each instance of the replicated service and an 
RMO operates for each service or application. 

The data collected by RMO 204 from the various IMOs 
is analyzed in accordance with the selected policy and a 
preferred server is identified. As discussed above, the pre- 
ferred server may, for example, be the one having the fastest 
response time (and which is thus likely to be the least-loaded 
server) or the one closest to nameserver 100. Illustratively, 
RMO 204 maintains a data structure (e.g., array, vector, 
table, database) identifying each server and/or instance of 
the replicated service that is being load-balanced, along with 
one or more values or other indicators or summaries of the 
collected information concerning each server (or service 
instance). 
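
One possible form of such a data structure is sketched below, for illustration only. The class name, the dictionary layout and the rule that servers reported as down are excluded from selection are assumptions of this sketch.

```python
class ReplicatedMonitor:
    """Holds the latest values per server and picks the preferred one."""

    def __init__(self, metric):
        self.metric = metric  # value to minimize, e.g. "response_ms" or "hops"
        self.table = {}       # server name -> latest collected values

    def record(self, server, values):
        # Called with the data retrieved from one IMO.
        self.table[server] = dict(values)

    def preferred(self):
        # Assumed rule: a server reported as down is never preferred.
        up = {s: v for s, v in self.table.items() if v.get("up", True)}
        if not up:
            return None
        return min(up, key=lambda s: up[s][self.metric])


rmo = ReplicatedMonitor("response_ms")
rmo.record("server110", {"up": True, "response_ms": 42.0})
rmo.record("server112", {"up": False, "response_ms": 5.0})
rmo.record("server114", {"up": True, "response_ms": 30.1})
print(rmo.preferred())
```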
Finally, in the illustrated embodiment DNS updater object
206 gathers and analyzes data from RMO 204 and updates 
zone file 104 after the collected information is analyzed and 
a preferred server is selected. In this embodiment, RMO 204 
retrieves the collected data and DNS updater 206 updates the 
zone file on a periodic basis. Illustratively, if the selected 
policy specifies the use of the closest server, RMO 204 and 
DNS updater 206 need not take action as often as they do 
when the policy requires the use of the server with the fastest 
response. 
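
The zone file update may be sketched as follows, for illustration only. The sketch assumes a simplified zone file in which the virtual server name has a single address (A) record on one line; the function name is hypothetical, the addresses are documentation placeholders, and steps a real name server would also need (such as incrementing the zone serial number and reloading the zone) are omitted.

```python
def update_a_record(zone_text, name, new_ip):
    """Point the A record for `name` at `new_ip`; other lines pass through."""
    out = []
    for line in zone_text.splitlines():
        fields = line.split()
        # Matches e.g. "www IN A 192.0.2.10" or "www 300 IN A 192.0.2.10".
        if len(fields) >= 3 and fields[0] == name and fields[-2] == "A":
            line = line[: line.rfind(fields[-1])] + new_ip
        out.append(line)
    return "\n".join(out)


zone = "www\tIN\tA\t192.0.2.10\nmail\tIN\tA\t192.0.2.20"
print(update_a_record(zone, "www", "192.0.2.12"))
```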

FIG. 3 depicts an illustrative embodiment of the invention 
employing an intrusive mode of operation. In this mode of 
operation, status objects and individual monitor objects 
reside and execute on the servers operating a replicated 
service or application. 

In FIG. 3, two replicated services or applications are 
offered among servers 110, 112 and 114. Thus, status objects 
300a and 302a collect load and/or operational data concern- 
ing a first replicated service (e.g., web browsing) or 
application, while status objects 302b and 304b collect load
and/or operational data concerning a second replicated ser- 
vice (e.g., electronic mail) or application. 

Each server also operates an IMO for each resident 
instance of a replicated service for the purpose of receiving 
data from one or more status objects associated with the 
IMO. For example, FIG. 3 depicts IMO 312a coupled to
status object 302a and IMO 312b coupled to status object
302b on server 112. Thus, in the embodiment of the invention
depicted in FIG. 3, status objects and IMOs reside on
individual servers that are being load-balanced, but perform 
substantially the same functions as in the embodiment 
depicted in FIG. 2. 

Replicated monitor object 320 interfaces with IMOs 310a 
and 312a and RMO 322 interfaces with IMOs 312b and
314b to retrieve the necessary information concerning the
replicated services. Various means of communication may 
be employed between the RMOs and IMOs. In a present 
embodiment of the invention Object Request Broker (ORB) 
technology is employed. In an alternative embodiment of the 
invention Remote Procedure Call (RPC) technology is used. 

DNS updater 330 also resides on nameserver 100 in the 
presently described embodiment and operates in substan- 
tially the same manner as described above. After the data 
concerning each instance of each replicated service is 
retrieved and analyzed, DNS updater 330 updates the DNS 
zone file to reflect the preferred server for each replicated 
service. Illustratively, one DNS updater is used to update the 
zone file for all replicated services being load-balanced. 
However, in an alternative embodiment of the invention 
separate DNS updaters may be employed for each replicated 
service or application.

FIG. 4 depicts an alternative embodiment of the invention 
in which servers offering a replicated service or application 
are geographically dispersed. In FIG. 4, server farm 400 
represents a first collection of servers offering the replicated 
service or application and server farm 410 represents a 
second collection of servers offering the same service. 
Although server farms are depicted with multiple servers 
(i.e., servers 402 and 404 in server farm 400 and servers 412 
and 414 in server farm 410), a server farm may consist of 
any number of servers, even one. 

Each server farm in the presently described embodiment 
also includes an intermediate server (i.e., server 406 in 
server farm 400 and server 416 in server farm 410). One 
function of an intermediate server in this embodiment is to 
collect, from the other servers in the farm that are offering 
the replicated service, the information necessary to select a
preferred server. For example, intermediate replicated moni- 
tor object (IRMO) 406a is operated on intermediate server 
406 to collect data from servers 402 and 404. IRMO 406a 
thus operates similarly to the RMOs described above, but is
illustratively located on a server situated between 
nameserver 100 and the servers offering the replicated 
service. As described in conjunction with FIG. 3, status 
objects (e.g., depicted by numerals 402a, 404a, 412a and 
414a) and IMOs (e.g., depicted by numerals 402b, 404b,
412b and 414b) operate on servers 402, 404, 412 and 414.

RMO 420 operates on nameserver 100 to collect data
from the IRMOs within each server farm (e.g., IRMOs 406a
and 416a). DNS updater 422 updates zone file 104 to reflect
the preferred server identified from the data collected by
RMO 420.
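The two-level collection described above may be sketched, purely for illustration, as follows. The class names, the use of plain callables to stand in for IMOs, and the load figures are assumptions of this sketch and do not appear in the described embodiment.

```python
from typing import Callable, Dict, Iterable

# A report maps a server name to one measured value (here, a load figure).
Report = Dict[str, float]

class IntermediateMonitor:
    """Hypothetical IRMO: gathers readings from the IMOs of one server farm."""

    def __init__(self, imo_readers: Dict[str, Callable[[], float]]):
        self.imo_readers = imo_readers

    def collect(self) -> Report:
        return {server: read() for server, read in self.imo_readers.items()}

class ReplicatedMonitor:
    """Hypothetical RMO: merges the reports of every IRMO it is given."""

    def __init__(self, irmos: Iterable[IntermediateMonitor]):
        self.irmos = list(irmos)

    def collect(self) -> Report:
        merged: Report = {}
        for irmo in self.irmos:
            merged.update(irmo.collect())
        return merged

# Illustrative readings only; the lambdas stand in for IMOs on each server.
farm_400 = IntermediateMonitor({"server402": lambda: 0.7, "server404": lambda: 0.2})
farm_410 = IntermediateMonitor({"server412": lambda: 0.5, "server414": lambda: 0.9})
rmo = ReplicatedMonitor([farm_400, farm_410])
report = rmo.collect()
preferred = min(report, key=report.get)  # least-loaded server across both farms
```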

In an alternative embodiment of the invention in which a
replicated service is offered on multiple servers, one or more
of which are local and one or more of which are remote,
aspects of the embodiments of the invention depicted in
FIGS. 3 and 4 are combined. In this alternative embodiment,
intermediate servers with IRMOs are employed in server
farms comprising the remote servers to pass data between
the remote servers' IMOs and an RMO, as in the embodi-
ment depicted in FIG. 4. Local servers, however, employ
IMOs that communicate with the RMO without an intervening
IRMO, as in FIG. 3.

In another alternative embodiment of the invention, load
balancing for a replicated service is performed among
multiple participating servers wherein one or more of the
servers are segregated (e.g., situated in a remote location
and/or within a server farm). Within the group of segregated
servers, a "local" load balancing policy may be implemented
for distributing among the servers all client requests sent to
the group (or to any member of the group). In this alternative
embodiment, the segregated servers may be considered a
single entity for the purposes of a "global" load balancing
policy specifying the manner in which all client requests for
the replicated service are to be distributed among all par-
ticipating servers. The global and local policies need not be
equivalent (e.g., the global policy may require selection of
the closest server (or group of servers) while the local policy
may require the least-loaded server).
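As a minimal sketch of the example just given, a global policy may first choose the closest group and a local policy may then choose the least-loaded member of that group. The `Server` and `Group` types, the hop counts, and the load figures below are hypothetical and serve only to illustrate the two-level selection.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Server:
    name: str
    load: int   # e.g., number of pending requests (illustrative measure)

@dataclass
class Group:
    name: str
    hops: int   # distance from the DNS server (illustrative measure)
    servers: List[Server]

def select_group(groups: List[Group]) -> Group:
    """Global policy: treat each group as a single entity, pick the closest."""
    return min(groups, key=lambda g: g.hops)

def select_server(group: Group) -> Server:
    """Local policy: within the chosen group, pick the least-loaded server."""
    return min(group.servers, key=lambda s: s.load)

groups = [
    Group("remote_farm", hops=6, servers=[Server("r1", 3), Server("r2", 1)]),
    Group("local_farm", hops=1, servers=[Server("l1", 9), Server("l2", 4)]),
]
chosen_group = select_group(groups)
chosen_server = select_server(chosen_group)
```

Note that the server selected in this way ("l2") is not the least-loaded server overall ("r2"), which reflects the fact that the global and local policies need not be equivalent.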

With reference now to FIG. 5, an illustrative method of
load balancing between multiple instances of a replicated
service is depicted in a flow chart. The replicated service
illustratively comprises Internet access to data and content
(e.g., web pages) through a virtual server name. A DNS
server resolves client requests for the virtual server name to
an identifier of a server configured to satisfy such requests.
Each instance of the replicated service operates on a separate
server. State 500 is a start state.

In state 502 a policy to be applied during the load
balancing is selected. Illustrative policies in a present
embodiment of the invention focus upon the availability or
status of the servers offering the replicated service. Such
policies include shortest distance (i.e., client requests are to
be routed to the server that is closest to the DNS server) and
response time (i.e., client requests are to be routed to the
server that offers the fastest response or that has the smallest
load).
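One way to picture the selection of a policy in state 502 is as the choice of a scoring function that is later applied to the status gathered for each server. The sketch below is illustrative only; the policy names, the field names `hops` and `response_ms`, and the sample values are assumptions and are not drawn from the described embodiment.

```python
from typing import Callable, Dict, Mapping

# Each status record holds the measurements gathered for one server.
# The field names ("hops", "response_ms") are illustrative assumptions.
Status = Mapping[str, float]

# Each policy scores a status record; the lowest score identifies the preferred server.
POLICIES: Dict[str, Callable[[Status], float]] = {
    "shortest_distance": lambda s: s["hops"],
    "response_time": lambda s: s["response_ms"],
}

def preferred_server(policy: str, statuses: Mapping[str, Status]) -> str:
    """Return the name of the server that scores lowest under the named policy."""
    score = POLICIES[policy]
    return min(statuses, key=lambda name: score(statuses[name]))

statuses = {
    "server_a": {"hops": 2, "response_ms": 40.0},
    "server_b": {"hops": 5, "response_ms": 12.0},
}
```

With these sample values, the shortest-distance policy prefers `server_a` while the response-time policy prefers `server_b`, so the same collected data can yield different preferred servers depending on the policy selected.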

In an alternative embodiment of the invention, policies 
are application-specific and are based upon specific aspects 
of an application being load-balanced. For example, where
access requests for a database management system (DBMS) 
are load balanced, illustrative policies may include routing 
requests to the server on which the fewest DBMS requests
have been processed or the server having the fewest con- 
nected users or the fewest unfulfilled processing or access 
requests. For each application for which requests are load- 
balanced, separate policies may be employed. One skilled in 
the art will appreciate that this alternative embodiment may 
require the collection of application-specific data (as 
opposed to information concerning the host server in 
general). In such an event, the application may require 
modification. 

In state 504, a mode of operation is chosen. Illustrative 
modes of operation include intrusive or non-intrusive col-
lection of information concerning a server (or replicated
service or application offered by the server). 

In state 506, the policy for a replicated service or appli- 
cation is encapsulated into status objects or other computer- 
readable instructions. For example, if a policy of selecting 
the server having the fastest response time is selected, and 
a non-intrusive mode of operation is to be used, a status 
object is constructed to direct a Ping command (or similar 
networking test command) from the DNS server to a server 
offering the replicated service or application. Illustratively, 
the status object will also be designed to compute an amount 
of time that elapses between the time the Ping command is 
issued and a response is received. 
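A non-intrusive status object of this kind might, for illustration, be sketched as below. Because issuing an ICMP Ping typically requires elevated privileges, this sketch substitutes the time taken to open a TCP connection, which is a different measurement chosen here for convenience. The function names `timed` and `tcp_connect_probe` are hypothetical.

```python
import socket
import time
from typing import Callable, Optional

def timed(probe: Callable[[], None]) -> Optional[float]:
    """Run probe and return the elapsed seconds, or None if the probe failed."""
    start = time.perf_counter()
    try:
        probe()
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return None
    return time.perf_counter() - start

def tcp_connect_probe(host: str, port: int, timeout: float = 2.0) -> Callable[[], None]:
    """Build a probe that opens, then closes, a TCP connection to host:port."""
    def probe() -> None:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    return probe

# Usage (not executed here): elapsed = timed(tcp_connect_probe("192.0.2.10", 80))
```

A result of None lets the caller treat an unreachable server as unavailable rather than as merely slow.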

In contrast, if an intrusive mode of operation is to be used, 
a status object reflecting a fastest-response policy is con- 
structed to execute on a server (e.g., a server offering the 
replicated service) in order to ascertain the number of 
requests pending on the server. 

As discussed above, in a current embodiment of the 
invention status objects are constructed using an object-
oriented programming language. One skilled in the art will
recognize that many suitable programming languages and 
tools exist and that the invention may be implemented using 
techniques other than object-oriented programming. 

In state 508 individual monitor objects are created or 
invoked. Illustratively, one IMO is generated for each
instance of a replicated service. Depending upon whether the 
replicated service is to be load-balanced intrusively or 
non-intrusively, the IMO objects are installed on either the 
individual servers offering the service, the DNS server, or 
some intermediate computer system. As described above,
IMO objects may be configured to invoke one or more status 
objects and collect and report certain information or data. 
The collected information may include a server's load (e.g., 
number of requests waiting and/or being processed), capac- 
ity (e.g., the number of requests the server can handle), 
operational status (e.g., whether the server is up or down), 
etc. 
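The role of an individual monitor object may be illustrated with the following sketch, in which an IMO invokes each of its status objects and stores the results for later retrieval. The class name, the method names, and the use of plain callables as status objects are assumptions of this sketch.

```python
from typing import Any, Callable, Dict

class IndividualMonitor:
    """Hypothetical IMO: invokes status objects for one server and keeps the results."""

    def __init__(self, server: str, status_objects: Dict[str, Callable[[], Any]]):
        self.server = server
        self.status_objects = dict(status_objects)
        self.latest: Dict[str, Any] = {}

    def poll(self) -> None:
        """Invoke every status object and store what each one returns."""
        results: Dict[str, Any] = {}
        for name, status_object in self.status_objects.items():
            try:
                results[name] = status_object()
            except Exception:
                # Record the failure instead of abandoning the whole poll.
                results[name] = None
        self.latest = results

    def report(self) -> Dict[str, Any]:
        """Return the stored information, e.g., for retrieval by an RMO."""
        return {"server": self.server, **self.latest}

# Illustrative status objects reporting load, capacity and operational status.
imo = IndividualMonitor(
    "server402",
    {"load": lambda: 7, "capacity": lambda: 100, "up": lambda: True},
)
imo.poll()
```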

For effective load balancing, information is collected and 
processed as described above to identify a preferred server 
in accordance with a policy. The preferred server then 
receives requests for the replicated service until a different 
preferred server is identified. In a present embodiment of the 
invention, the active policy for a replicated service or 
application may be changed without disrupting the handling 
of client requests. Illustratively, this is done by temporarily 
pausing the operation of IMOs for the service, installing new 
status objects reflecting the new policy, then resuming the 
IMOs. Advantageously, the IMO objects need not be altered. 
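The pause, install and resume sequence just described might be sketched as follows. This is an illustration under stated assumptions: the class `PausableMonitor`, the function `change_policy`, and the sample measurements are hypothetical, and a concurrent implementation would additionally need locking.

```python
from typing import Callable, Iterable, Optional

class PausableMonitor:
    """Hypothetical IMO that can be paused while its status object is replaced."""

    def __init__(self, server: str, status_object: Callable[[str], float]):
        self.server = server
        self.status_object = status_object
        self.paused = False
        self.latest: Optional[float] = None

    def poll(self) -> Optional[float]:
        # While paused, keep reporting the last stored value.
        if not self.paused:
            self.latest = self.status_object(self.server)
        return self.latest

def change_policy(monitors: Iterable[PausableMonitor],
                  new_status_object: Callable[[str], float]) -> None:
    """Pause the monitors, install the new status object, then resume them."""
    monitors = list(monitors)
    for m in monitors:
        m.paused = True
    for m in monitors:
        m.status_object = new_status_object  # the monitor class itself is unchanged
    for m in monitors:
        m.paused = False

# Illustrative data: switch from a response-time policy to a pending-requests policy.
response_ms = {"server_a": 40.0, "server_b": 12.0}
pending = {"server_a": 3, "server_b": 8}
imos = [PausableMonitor(name, response_ms.get) for name in response_ms]
before = {m.server: m.poll() for m in imos}
change_policy(imos, pending.get)
after = {m.server: m.poll() for m in imos}
```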

Similarly, a server may easily be removed from load 
balancing in accordance with the embodiments of the inven- 
tion discussed herein. A server may, for example, become 
inoperative or be replaced by another server. Illustratively,
an RMO maintains a list (array, linked list, vector, etc.) of all 
servers participating in the load balancing (e.g., all servers
offering an instance of the replicated service or application).
By temporarily pausing the RMO, removing a server from
the list and restarting the RMO, the RMO will stop attempting
to retrieve information from the removed server (i.e., the
RMO will stop attempting to communicate with an IMO on
the server). Servers may be added to the load-balancing
scheme in a similar manner.

In state 510 a replicated monitor object is created for each
replicated service or application to be load-balanced. As
described above, the RMO is illustratively installed on the
DNS server and communicates with IMOs using a suitable
format or protocol (e.g., ORB or RPC). In an alternative
embodiment in which intermediate servers are employed
(e.g., where remote servers or server farms are included), an
intermediate RMO is generated for each intermediate server.

Then, in state 512 a specialized object is generated to
apply the results of the data collected from the replicated
service servers and identify a preferred server. Where, for
example, the replicated service includes Internet access to
web servers, a DNS updater is configured on the DNS server
to modify the DNS lookup table (e.g., a zone file) to reflect
the server to which requests are to be routed. Similarly,
where load balancing is being performed for an application
operating in a master/slave relationship (e.g., a master
process or server routes requests to slave processes or
servers), the specialized object updates a data structure or
entry indicating a preferred process or server.

After the various executable objects or program modules
are configured and installed, the collection of server/service
information can begin. Therefore, in state 514 an IMO
invokes or calls a status object, illustratively to determine a
server or replicated service's status or to retrieve data
concerning the server's load. In a present embodiment of the
invention, both the status object and IMO execute on the
same computer system (e.g., a server offering a replicated
service or application).

In state 516 the status object returns the information it was
configured to gather and the IMO stores the information. In
state 518 an RMO calls or otherwise communicates with the
IMO to retrieve the information it has stored. The RMO may
similarly communicate with additional IMOs storing information
concerning other servers or instances of the replicated
service. Illustratively, the RMO executes on a DNS
server and stores the information retrieved from the IMOs
for analysis.

In state 520 the information retrieved by the RMO is
analyzed in accordance with the selected policy to choose a
preferred server. Depending upon the information, the analysis
may consist of identifying the server or the instance of a
replicated service having the shortest response time, the
server that is located the fewest hops from the DNS server, or
the server having the lightest load (e.g., the number of HTTP
(Hypertext Transfer Protocol) requests the server has
received, perhaps over a particular period of time).

In state 522 the zone file for the DNS server is updated to
indicate the preferred server. Illustratively, the update procedure
comprises associating a network address of the
preferred server with the name of a virtual server/service
through which clients access the replicated service or application.
In addition, in a present embodiment of the
invention, the DNS server is signaled to reload the zone file.
State 524 is an end state.

The foregoing descriptions of embodiments of the invention
have been presented for purposes of illustration and
description only. They are not intended to be exhaustive or
to limit the invention to the forms disclosed. Many modifications
and variations will be apparent to practitioners
skilled in the art. Accordingly, the above disclosure is not
intended to limit the invention; the scope of the invention is
defined by the appended claims.

What is claimed is:

1. A method of balancing requests for a replicated service
among a plurality of servers, wherein the requests are
received at a central server, the method comprising:

selecting a policy, said policy comprising one or more
factors for selecting a preferred server to receive a
request for the replicated service, wherein said one or
more factors includes a first factor;

operating a first status module to determine a status of said
first factor for a first server;

operating a second status module to determine a status of
said first factor for a second server;

receiving said first server status at the central server;

receiving said second server status at the central server;

examining said first server status and said second server
status to select a preferred server;

storing an identifier of said preferred server on the central
server; and

directing a request for the replicated service received at
the central server to said preferred server, wherein said
directed request is received after said storing.

2. The method of claim 1, further comprising maintaining
a server monitor module to receive said first server status
from said first status module.

3. The method of claim 2, wherein said policy comprises
a second factor for selecting a preferred server, the method
further comprising:

operating a third status module to determine a first status
of said second factor for said first server;

wherein said server monitor module also receives said
first status of said second factor.

4. The method of claim 2, wherein said server monitor
module executes on said first server.

5. The method of claim 2, wherein said server monitor
module executes on the central server.

6. The method of claim 1, further comprising maintaining
a central monitor module for retrieving said first server
status and said second server status.

7. The method of claim 6, wherein said central monitor
module executes on the central server.

8. The method of claim 1, wherein said operating a first
status module comprises invoking a first status module
residing on the first server.

9. The method of claim 1, wherein said operating a first
status module comprises pinging a first server by the central
server and operating a second status module comprises
pinging a second server by the central server.

10. The method of claim 1, further comprising:

selecting a local policy for a subset of the plurality of
servers, said local policy specifying a second factor for
selecting a server to receive a request for the replicated
service.

11. The method of claim 1, wherein for each of said
plurality of servers, a status of each of said one or more
factors is determined by a separate status module.

12. The method of claim 1, wherein said operating a first
status module comprises issuing a Connect command to a
first server by the central server and operating a second
status module comprises issuing a Connect command to a
second server by the central server.
13. A method of load balancing requests for a replicated
service received at a central server among a set of servers,
comprising:

selecting a policy for directing a request for the replicated
service to a preferred server, wherein said policy speci-
fies a server factor for selecting said preferred server
from the set of servers;

executing a first status object on the central server to
determine a first status of said server factor for a first
server in the set of servers;

executing a first server monitor object on the central
server to receive said first status;

executing a central monitor object on the central server to
receive multiple statuses of said server factor for mul-
tiple servers in the set of servers, including said first
status;

examining said multiple statuses to select a preferred
server; and

updating the central server to identify said preferred
server.

14. The method of claim 13, wherein the set of servers 
includes a first subset, the method further comprising: 

configuring an intermediate central monitor object on an 
intermediate server to collect one or more statuses of
said server factor for one or more members of said first 
subset; and 

receiving said one or more statuses at the central server 
from said intermediate central monitor object. 

15. The method of claim 14, further comprising selecting
a local policy for balancing requests for the replicated
service among the members of the first subset according to
a local server factor.

16. The method of claim 15, wherein said local server 
policy is different from said policy.

17. The method of claim 13, wherein said central server 
comprises a domain name server, further comprising updat- 
ing a lookup table associated with the domain name server 
to associate said preferred server with the replicated service. 

18. The method of claim 17, wherein said lookup table
comprises a zone file and said updating comprises storing a 
network address of said preferred server to facilitate direct- 
ing a future request for the replicated service from the 
central server to said preferred server. 

19. An apparatus for balancing requests for a replicated
service among multiple servers, wherein the requests are
received at a central server, comprising:

a first server configured to operate a first instance of the
replicated service;

a second server configured to operate a second instance of
the replicated service;

a first status module on the central server configured to
determine a first status of said first server;

a second status module on the central server configured to
determine a second status of said second server;

a first server monitor module on the central server con-
figured to invoke said first status module and receive
said first status;

a second server monitor module on the central server
configured to invoke said second status module and
receive said second status;

a central monitor module configured to receive said first
status and said second status;

a preferred server identifier configured to identify a pre-
ferred server for receiving a future request for the
replicated service; and

an update module configured to update said preferred
server identifier to indicate one of said first server and
said second server to receive a request for the replicated
service.

20. The apparatus of claim 19, wherein said first status 
module is a Ping command issued by the central server to 
said first server. 

21. The apparatus of claim 19, wherein the central server 
comprises said central monitor module and said update
module. 

22. The apparatus of claim 19, further comprising a server 
farm, said server farm comprising: 

one or more servers; and 

an intermediate central monitor module configured to 
receive a status of one of said one or more servers and 
communicate said status to said central monitor mod- 
ule. 

23. The apparatus of claim 19, wherein said first status 
module is a Connect command issued by the central server 
to said first server. 

24. An apparatus for load balancing requests for a repli- 
cated service received at a central server, comprising: 

a first status determination means for determining a first 
status of a first server offering the replicated service; 

a second status determination means for determining a 
second status of a second server offering the replicated 
service; 

a first server monitor means for invoking said first status 
determination means; 

central monitor means for receiving said first status and 
said second status; 

server selection means for selecting a preferred server 
from one of said first server and said second server; and 

updating means for storing an identifier of said preferred 
server on the central server; 

wherein one or more requests for the replicated service 
received after said updating are directed to said pre- 
ferred server. 

25. The apparatus of claim 24, wherein said first status 
determination means and said first server monitor means are 
located on said first server. 

26. A method of load balancing requests for a replicated 
service received at a central server among a set of servers, 
comprising: 

selecting a policy for directing a request for the replicated 
service to a preferred server, wherein said policy speci- 
fies a server factor for selecting said preferred server 
from the set of servers; 

configuring a first status object on a first server in the set 
of servers to determine a first status of said server factor 
for said first server; 

configuring a first server monitor object on said first 
server to receive said first status; 

configuring a central monitor object on the central server 
to receive multiple statuses of said server factor for 
multiple servers in the set of servers, including said first 
status; 

examining said multiple statuses to select a preferred 
server; 

updating the central server to identify said preferred 
server; and 

directing a request for the replicated service received at 
the central server to said preferred server, wherein said 
directed request is received after said updating. 
27. The method of claim 26, further comprising:

invoking said first status object;

storing said first status with said first server monitor 
object; and 

receiving said first status at the central server, by said 
central monitor object, from said first server monitor 
object. 

28. The method of claim 26, wherein the set of servers 
includes a first subset, the method further comprising:

configuring an intermediate central monitor object on an 
intermediate server within the first subset to collect one 
or more statuses of said server factor for one or more 
members of the subset; and 

receiving said one or more statuses at the central server
from said intermediate central monitor object. 

29. The method of claim 28, further comprising selecting 
a local policy for balancing requests for the replicated 
service among the members of the subset according to a 
local server factor, wherein said local server factor is dif-
ferent from said server factor.

30. A computer readable storage medium storing instruc- 
tions that, when executed by a computer, cause the computer 
to perform a method for balancing requests for a replicated 
service among a plurality of servers, wherein the requests
are received at a central server, the method comprising:

selecting a policy, said policy specifying a server-
selection factor for selecting a preferred server to
receive a request for the replicated service;

invoking a first status module to determine a first server-
selection factor of a first server;

invoking a second status module to determine a second
server-selection factor of a second server;

receiving said first server-selection factor at the central
server;

receiving said second server-selection factor at the central
server;

examining said first server-selection factor and said sec-
ond server-selection factor to select a preferred server;

storing an identifier of said preferred server on the central
server; and

directing a request for the replicated service received at
the central server to said preferred server, wherein said
directed request is received after said storing.